Corpus: msa-my_web_2013

3.12.9 Problems with sentence segmentation - Words ending in a stopword

The table lists the most frequent words that end in a stopword. They usually contain uppercase letters inside the word as a result of missing blanks between two words.

Stopword  Concatenated word   Frequency of stopword  Frequency of concatenated word
Allah     InsyaAllah          61219                  942
Allah     insyaAllah          61219                  371
Saya      ‘‘Saya              82568                  298
Saya      ‘Saya               82568                  140
Jabatan   PegawaiJabatan      59042                  119
Malaysia  JobsMalaysia        254245                 114
Majlis    navigationMajlis    61558                  114
Majlis    KerjayaMajlis       61558                  104
Saya      ¡°Saya              82568                  100
Ini       ‘‘Ini               63367                  92
Kuala     MenengahKuala       69684                  91
Malaysia  ‘Malaysia           254245                 87
Jabatan   KementerianJabatan  59042                  67
Allah     ‘Allah              61219                  66
Pada      2012Pada            64152                  64
Saya      tergugatSaya        82568                  59
Malaysia  BioMalaysia         254245                 57
Datuk     ‘Datuk              87978                  54
Ia        ‘‘Ia                54878                  52
Islam     ‘Islam              140825                 50
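
A table of this kind could be produced roughly as follows. This is a minimal sketch, not the method actually used for this report: the function name `words_ending_in_stopword`, the parameter `min_prefix_len`, and the case-sensitive suffix match are all assumptions made for illustration, and the sample tokens are invented rather than taken from the corpus.

```python
from collections import Counter


def words_ending_in_stopword(tokens, stopwords, min_prefix_len=1):
    """Return rows (stopword, concatenated word, stopword freq, word freq).

    Hypothetical helper: a token qualifies when it ends in a stopword
    (case-sensitive) and has at least `min_prefix_len` extra characters
    in front of it, so the stopword itself is never reported.
    """
    freq = Counter(tokens)
    rows = []
    for word, count in freq.items():
        for stop in stopwords:
            long_enough = len(word) >= len(stop) + min_prefix_len
            if long_enough and word.endswith(stop):
                rows.append((stop, word, freq[stop], count))
    # Most frequent concatenated words first; ties broken alphabetically.
    rows.sort(key=lambda row: (-row[3], row[1]))
    return rows


# Invented sample data, only to show the shape of the result.
sample_tokens = ["Allah"] * 3 + ["InsyaAllah"] * 2 + ["Saya", "‘Saya", "makan"]
sample_stopwords = {"Allah", "Saya"}

for row in words_ending_in_stopword(sample_tokens, sample_stopwords):
    print(*row)
```

The match is kept case-sensitive because the stopwords in the table are capitalized forms; a lowercase comparison would also flag ordinary words that merely end in the same letters.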